ff1418e8cc993fe8abcfe3ce2003e5c5-AuthorFeedback.pdf
Below we clarify each question, and we hope the reviewers will raise their scores based on the responses. As noted in the paper (L205), we have provided the detailed experiment settings. The default ratio value is 50%, i.e., we train the passport-aware branch for 1 iteration after every 1 iteration of normal training. In this case, the theoretical computation cost is 2x. More importantly, this introduces no extra cost at deployment. It can then be viewed as a special case (i.e., only the nonlinear transform). We adopt a similar setting to the trigger-set based method [10].
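The 50% ratio described above can be sketched as a simple alternating schedule (function and branch names here are illustrative, not from the paper): one passport-aware iteration after every normal iteration, which doubles training cost but changes nothing at deployment.

```python
# Hypothetical sketch of the alternating training schedule: with
# ratio = 50%, every second iteration trains the passport-aware
# branch, so total training cost is roughly 2x the baseline.

def alternating_schedule(num_iters, ratio=0.5):
    """Yield which branch to train at each step."""
    # ratio = fraction of steps spent on the passport-aware branch
    period = round(1 / ratio)  # ratio 0.5 -> every 2nd step
    for step in range(num_iters):
        if (step + 1) % period == 0:
            yield "passport_aware"
        else:
            yield "main"

schedule = list(alternating_schedule(6))
```

At deployment only the main branch is kept, which is why the extra cost is training-time only.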
2d6cc4b2d139a53512fb8cbb3086ae2e-Reviews.html
First provide a summary of the paper, and then address the following criteria: quality, clarity, originality, and significance. This paper proposes a model for labeling images with classes for which no examples appear in the training set, based on a combination of word and image embeddings and novelty detection. Using distances in the embedding space between test images and unseen and seen class labels, the approach assigns a probability that a new image belongs to an unseen class. This probability is then used to decide which classifier to apply (one designed for seen classes, the other for unknown ones). Results on CIFAR-10 are provided.
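The routing described in this summary can be sketched numerically. This is an illustrative toy, not the paper's exact model: the distance from an image embedding to its nearest seen-class label embedding is mapped to a pseudo-probability of novelty, which selects between the two classifiers. The function names and the exponential mapping are assumptions for illustration.

```python
import numpy as np

def novelty_probability(img_emb, seen_class_embs, temperature=1.0):
    """Turn distance-to-nearest-seen-class into a novelty score in [0, 1)."""
    dists = np.linalg.norm(seen_class_embs - img_emb, axis=1)
    # larger minimum distance -> more likely to be an unseen class
    return 1.0 - np.exp(-dists.min() / temperature)

def classify(img_emb, seen_embs, unseen_embs, threshold=0.5):
    """Route to the seen- or unseen-class classifier by nearest label embedding."""
    if novelty_probability(img_emb, seen_embs) > threshold:
        d_unseen = np.linalg.norm(unseen_embs - img_emb, axis=1)
        return "unseen", int(np.argmin(d_unseen))
    d_seen = np.linalg.norm(seen_embs - img_emb, axis=1)
    return "seen", int(np.argmin(d_seen))

# toy 2-D embedding space: two seen classes, one unseen class
seen = np.array([[0.0, 0.0], [1.0, 0.0]])
unseen = np.array([[5.0, 5.0]])
near_seen = classify(np.array([0.1, 0.0]), seen, unseen)
far_from_seen = classify(np.array([5.0, 5.0]), seen, unseen)
```

An image close to a seen-class embedding gets a low novelty score and is routed to the seen-class classifier; one far from all seen classes is routed to the unseen-class classifier.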
Enhancing Low-Rank Adaptation with Structured Nonlinear Transformations
Deng, Guanzhi, Liu, Mingyang, Wu, Dapeng, Li, Yinqiao, Song, Linqi
Low-Rank Adaptation (LoRA) is a widely adopted parameter-efficient fine-tuning method for large language models. However, its linear nature limits expressiveness. We propose LoRAN, a non-linear extension of LoRA that applies lightweight transformations to the low-rank updates. We further introduce Sinter, a sine-based activation that adds structured perturbations without increasing parameter count. Experiments across summarization and classification tasks show that LoRAN consistently improves over QLoRA. Ablation studies reveal that Sinter outperforms standard activations such as Sigmoid, ReLU, and Tanh, highlighting the importance of activation design in low-rank tuning.
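The idea in the abstract can be sketched as follows. This is a hedged reconstruction, not the paper's exact formulation: a standard LoRA update is linear, ΔW·x = B(Ax); LoRAN is assumed here to insert a lightweight non-linearity f between the two low-rank factors, and Sinter is assumed to be a parameter-free sinusoidal perturbation f(z) = z + α·sin(βz).

```python
import numpy as np

def sinter(z, alpha=0.1, beta=1.0):
    # assumed sine-based activation: identity plus a structured
    # sinusoidal perturbation; alpha and beta are fixed, so no
    # trainable parameters are added
    return z + alpha * np.sin(beta * z)

def loran_delta(x, A, B, activation=sinter):
    """Non-linear low-rank update: B @ f(A @ x)."""
    return B @ activation(A @ x)

rng = np.random.default_rng(0)
d, r = 8, 2                              # model dim, low rank
A = rng.normal(size=(r, d))              # down-projection
B = rng.normal(size=(d, r))              # up-projection
x = rng.normal(size=d)
delta = loran_delta(x, A, B)
```

With the identity activation this reduces exactly to the linear LoRA update, which makes plain LoRA a special case of the sketch.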
- Asia > China > Hong Kong (0.04)
- North America > United States > Georgia > Fulton County > Atlanta (0.04)
- Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
- Asia > China > Guangdong Province > Shenzhen (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.71)
Interpretable non-linear dimensionality reduction using gaussian weighted linear transformation
Dimensionality reduction techniques are fundamental for analyzing and visualizing high-dimensional data, with established methods like t-SNE and PCA presenting a trade-off between representational power and interpretability. This paper introduces a novel approach that bridges this gap by combining the interpretability of linear methods with the expressiveness of non-linear transformations. The proposed algorithm constructs a non-linear mapping between high-dimensional and low-dimensional spaces through a combination of linear transformations, each weighted by Gaussian functions. This architecture enables complex non-linear transformations while preserving the interpretability advantages of linear methods, as each transformation can be analyzed independently. The resulting model provides both powerful dimensionality reduction and transparent insights into the transformed space. Techniques for interpreting the learned transformations are presented, including methods for identifying suppressed dimensions and for showing how the space is expanded and contracted. These tools enable practitioners to understand how the algorithm preserves and modifies geometric relationships during dimensionality reduction. To ensure the practical utility of this algorithm, the creation of user-friendly software packages is emphasized, facilitating its adoption in both academia and industry.
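The core construction in the abstract can be sketched in a few lines. This is a minimal illustration under assumptions, not the paper's training procedure: the non-linear map is taken to be y(x) = Σₖ wₖ(x)·Wₖx, where each weight wₖ is a normalized Gaussian centred at a point μₖ, so each linear map Wₖ governs its own region of space and can be inspected independently.

```python
import numpy as np

def gaussian_weights(x, centers, sigma=1.0):
    """Normalized Gaussian weight of each linear map at point x."""
    d2 = np.sum((centers - x) ** 2, axis=1)
    w = np.exp(-d2 / (2 * sigma ** 2))
    return w / w.sum()  # weights sum to 1

def transform(x, centers, mats, sigma=1.0):
    """Gaussian-weighted combination of per-region linear maps."""
    w = gaussian_weights(x, centers, sigma)
    return sum(wk * (Wk @ x) for wk, Wk in zip(w, mats))

# toy example: two 3D -> 2D linear maps with well-separated centres
centers = np.array([[0.0, 0.0, 0.0], [10.0, 10.0, 10.0]])
mats = [np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]),
        np.array([[0.0, 0.0, 1.0], [1.0, 1.0, 1.0]])]
y = transform(np.array([1.0, 0.0, 0.0]), centers, mats)
```

Near a given centre the corresponding weight dominates, so the map behaves locally like that single linear transformation — which is exactly the property the interpretability tools exploit.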
Wide Graph Neural Networks: Aggregation Provably Leads to Exponentially Trainability Loss
Huang, Wei, Li, Yayong, Du, Weitao, Da Xu, Richard Yi, Yin, Jie, Chen, Ling
Graph convolutional networks (GCNs) and their variants have achieved great success in dealing with graph-structured data. However, it is well known that deep GCNs suffer from the over-smoothing problem, where node representations tend to become indistinguishable as more layers are stacked. Although extensive research has confirmed this prevailing understanding, few theoretical analyses have been conducted to study the expressivity and trainability of deep GCNs. In this work, we demonstrate these characterizations by studying the Gaussian Process Kernel (GPK) and Graph Neural Tangent Kernel (GNTK) of an infinitely-wide GCN, corresponding to the analysis of expressivity and trainability, respectively. We first prove that the expressivity of infinitely-wide GCNs decays at an exponential rate by applying mean-field theory to the GPK. In addition, we formulate the asymptotic behavior of the GNTK in the large-depth limit, which enables us to reveal that the trainability of wide and deep GCNs drops at an exponential rate. We further extend our theoretical framework to analyze residual-connection-like techniques. We find that these techniques can mildly mitigate the exponential decay, but they fail to overcome it fundamentally. Finally, all theoretical results in this work are corroborated experimentally on a variety of graph-structured datasets.
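The over-smoothing phenomenon the abstract starts from can be illustrated numerically (this toy is not the paper's kernel analysis): repeatedly applying the symmetric-normalized aggregation D^{-1/2}(A + I)D^{-1/2}, with no trainable weights, drives node representations toward each other, so their pairwise distances shrink with depth.

```python
import numpy as np

def normalized_adjacency(adj):
    """Symmetric-normalized adjacency with self-loops: D^-1/2 (A + I) D^-1/2."""
    adj = adj + np.eye(len(adj))
    d = adj.sum(axis=1)
    d_inv_sqrt = np.diag(d ** -0.5)
    return d_inv_sqrt @ adj @ d_inv_sqrt

def feature_spread(adj, features, depth):
    """Max pairwise distance between node features after `depth` aggregations."""
    a_hat = normalized_adjacency(adj)
    h = features
    for _ in range(depth):
        h = a_hat @ h  # one aggregation step, no weights or nonlinearity
    n = len(h)
    return max(np.linalg.norm(h[i] - h[j])
               for i in range(n) for j in range(n))

# 3-node path graph with distinct initial features
adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], float)
feats = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
spread_shallow = feature_spread(adj, feats, 1)
spread_deep = feature_spread(adj, feats, 20)
```

The spread after 20 aggregation steps is far smaller than after one, which is the qualitative collapse that the paper's GPK/GNTK analysis quantifies as exponential in depth.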
- Oceania > Australia > New South Wales > Sydney (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Neural Networks From Scratch in Python & R
I prefer Option 2 and take that approach to learning any new topic. I might not be able to tell you the entire math behind an algorithm, but I can tell you the intuition. I can tell you the best scenarios to apply an algorithm based on my experiments and understanding. In my interactions with people, I find that people don't take the time to develop this intuition, and hence they struggle to apply things in the right manner. In this article, I will discuss the building blocks of neural networks from scratch and focus more on developing this intuition for applying neural networks. We will code in both "Python" and "R".
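In that spirit, here is a minimal version of the building block the article develops: a single hidden layer trained with plain gradient descent on XOR. All names and hyper-parameters here are illustrative choices, not taken from the article, and the Python half is shown; the R version follows the same steps.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(42)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], float)  # XOR inputs
y = np.array([[0], [1], [1], [0]], float)              # XOR targets

W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)  # input -> hidden
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)  # hidden -> output

lr = 1.0
losses = []
for _ in range(5000):
    h = sigmoid(X @ W1 + b1)                 # forward pass
    out = sigmoid(h @ W2 + b2)
    losses.append(float(np.mean((out - y) ** 2)))
    d_out = (out - y) * out * (1 - out)      # backprop through output layer
    d_h = (d_out @ W2.T) * h * (1 - h)       # backprop through hidden layer
    W2 -= lr * h.T @ d_out
    b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h
    b1 -= lr * d_h.sum(axis=0)
```

The forward pass, the error signal, and the two weight updates are the whole intuition: everything else in larger networks is this loop repeated over more layers and more data.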
3 E's of AI: Creating explainable AI - IoT Agenda
In the race to build predictive models as quickly as possible with open-source tools that many users don't fully understand, many companies rush to operationalize AI models that are neither understood nor auditable. In my data science organization, we use two techniques -- blockchain and explainable latent features -- that dramatically improve the explainability of the AI models we build. In 2018 I filed a patent application (16/128,359 USA) on using blockchain to ensure that all of the decisions made about a machine learning model, a fundamental component of many AI solutions, are recorded and auditable. My patent describes how to codify analytic and machine learning model development using blockchain technology to associate a chain of entities, work tasks and requirements with a model, including testing and validation checks. The blockchain substantiates a trail of decision-making.
A Comprehensive Guide to Neural Networks for Beginners
Deep learning is everywhere…from classifying images and translating languages to building a self-driving car. All these tasks are being driven by computers rather than manual human effort. And no, for doing so you don't need to be a magician, you just need to have a solid grasp on deep learning techniques. And yes, it is quite possible to learn it on your own! So, what is Deep learning? It is a phrase used for complex neural networks.
Understanding and coding Neural Networks From Scratch in Python and R
I prefer Option 2 and take that approach to learning any new topic. I might not be able to tell you the entire math behind an algorithm, but I can tell you the intuition. I can tell you the best scenarios to apply an algorithm based on my experiments and understanding. In my interactions with people, I find that people don't take the time to develop this intuition, and hence they struggle to apply things in the right manner. In this article, I will discuss the building blocks of a neural network from scratch and focus more on developing this intuition for applying neural networks.